Supplementary Material for “Adversarial Inverse Graphics Networks: Learning 2D-to-3D Lifting and Image-to-Image Translation from Unpaired Supervision”
Authors
Abstract
Here we discuss the benefits of using non-parametric and domain-specific renderers over learned decoders. Both the proposed model and CycleGAN [5] can be viewed as autoencoders: the input is first transformed into a target domain, and then transformed back to its original space. A parametric decoder might seem more desirable, since it removes the need to hand-engineer a mapping function from the target domain back to the inputs. However, simply using a reconstruction loss and an adversarial loss does not guarantee that the predictions look spatially similar to the inputs. In tasks such as image-to-image translation, spatial precision can be of critical importance. With a parametric decoder, the transformed input can be viewed as an information bottleneck: as long as the decoder can correctly "guess" the final output from the transformed input (i.e., the code), the code is valid and the solution is optimal, even if the code bears no spatial resemblance to the input. To support this point, we conduct an experiment on image inpainting using the MNIST dataset. Similar to the parametric encoder-decoder described in the main text, the network has two main parts: (1) an encoder that transforms the input (a partially obscured image of a digit) into a prediction (a hallucinated digit), and (2) a decoder that transforms the prediction back into the input. Instead of using convolutional layers, which have an architectural bias toward preserving spatial relationships, we use fully-connected layers in both the encoder and the decoder. This is important, because such architectural conveniences are unavailable in less-structured tasks, such as 3D pose prediction and SfM. We train the model with a reconstruction loss on the decoder and an adversarial loss on the encoder. The results are shown in Figure 1. While inpainting, the encoder (incorrectly) transforms many of the digits into other digits. For instance, several obscured "1" images are inpainted as "4".
In the parametric decoding process, however, these errors are undone, and the original input is reconstructed.
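To make the setup concrete, the following is a minimal NumPy sketch of the two losses in this experiment. It is not the authors' implementation: the layer widths, initialisation, non-saturating discriminator head, and masking rate are illustrative assumptions; only the overall structure (fully-connected encoder and decoder, reconstruction loss on the decoder, adversarial loss on the encoder's output) follows the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(n_in, n_out):
    # He-style initialisation for one fully-connected layer (assumed, not from the paper).
    return rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, n_out)), np.zeros(n_out)

def mlp(x, layers):
    # Fully-connected stack: ReLU on hidden layers, sigmoid on the output
    # so activations stay in the [0, 1] pixel range. No spatial structure
    # is baked in, mirroring the paper's deliberate avoidance of convolutions.
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)
        else:
            x = 1.0 / (1.0 + np.exp(-np.clip(x, -30.0, 30.0)))
    return x

D = 28 * 28  # flattened MNIST image
encoder = [dense(D, 256), dense(256, D)]  # obscured digit -> hallucinated digit
decoder = [dense(D, 256), dense(256, D)]  # hallucinated digit -> obscured input
discrim = [dense(D, 256), dense(256, 1)]  # scores how "digit-like" the hallucination is

x_full = rng.random((8, D))                       # stand-in for clean MNIST digits
mask = (rng.random((8, D)) > 0.3).astype(float)   # illustrative occlusion mask
x_obsc = x_full * mask                            # partially obscured inputs

code = mlp(x_obsc, encoder)   # hallucinated (inpainted) digit -- the "code"
x_rec = mlp(code, decoder)    # decoder's guess of the obscured input

# Reconstruction loss on the decoder: the code only has to let the decoder
# "guess" the input back; nothing ties the code spatially to x_obsc.
loss_rec = np.mean((x_rec - x_obsc) ** 2)

# Adversarial loss on the encoder: the hallucination must merely look like
# *some* real digit to the discriminator, not the right digit.
p_fake = mlp(code, discrim)
loss_adv = -np.mean(np.log(p_fake + 1e-8))

print(f"reconstruction loss {loss_rec:.3f}, adversarial loss {loss_adv:.3f}")
```

Nothing in either term penalises the encoder for inpainting an obscured "1" as a "4": the reconstruction loss is satisfied as long as the decoder can invert the code, and the adversarial loss is satisfied as long as the code resembles some digit. A non-parametric, hand-specified decoder closes exactly this loophole.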
Similar resources
Unsupervised 3D Reconstruction from a Single Image via Adversarial Learning
Recent advancements in deep learning opened new opportunities for learning a high-quality 3D model from a single 2D image, given sufficient training on large-scale data sets. However, the significant imbalance between the available amounts of images and 3D models, and the limited availability of labeled 2D image data (i.e., manually annotated pairs between images and their corresponding 3D models), se...
Improvement of generative adversarial networks for automatic text-to-image generation
This research is related to the use of deep learning tools and image processing technology in the automatic generation of images from text. Previous research has used a single sentence to produce images. In this research, a memory-based hierarchical model is presented that uses three different descriptions, presented in the form of sentences, to produce and improve the image. The proposed ...
Ultra-Fast Image Reconstruction of Tomosynthesis Mammography Using GPU
Digital Breast Tomosynthesis (DBT) is a technology that creates three dimensional (3D) images of breast tissue. Tomosynthesis mammography detects lesions that are not detectable with other imaging systems. If image reconstruction time is in the order of seconds, we can use Tomosynthesis systems to perform Tomosynthesis-guided Interventional procedures. This research has been designed to study u...
Weakly Supervised Generative Adversarial Networks for 3D Reconstruction
Supervised 3D reconstruction has witnessed a significant progress through the use of deep neural networks. However, this increase in performance requires large scale annotations of 2D/3D data. In this paper, we explore inexpensive 2D supervision as an alternative for expensive 3D CAD annotation. Specifically, we use foreground masks as weak supervision through a raytrace pooling layer that enab...
Triangle Generative Adversarial Networks
A Triangle Generative Adversarial Network (∆-GAN) is developed for semisupervised cross-domain joint distribution matching, where the training data consists of samples from each domain, and supervision of domain correspondence is provided by only a few paired samples. ∆-GAN consists of four neural networks, two generators and two discriminators. The generators are designed to learn the two-way ...
Publication date: 2017